Word Order Does NOT Differ Significantly Between Chinese and Japanese

Authors

  • Chenchen Ding
  • Masao Utiyama
  • Eiichiro Sumita
  • Mikio Yamamoto
Abstract

We propose a pre-reordering approach for Japanese-to-Chinese statistical machine translation (SMT). The approach uses dependency structure and manually designed reordering rules to arrange the morphemes of Japanese sentences into a Chinese-like word order before a baseline phrase-based (PB) SMT system is applied. Experimental results on the ASPEC-JC data show that, compared with the organizer’s baseline PB SMT system, the proposed pre-reordering approach yields a slight improvement in BLEU and a moderate one in RIBES. The approach also shows improvement in human evaluation. We observe that word order does not differ much between the two languages, even though Japanese is a subject-object-verb (SOV) language and Chinese is a subject-verb-object (SVO) language.
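
As a rough illustration of what rule-based pre-reordering over a dependency structure can look like, the sketch below applies a single toy rule: a head that follows its object is moved in front of it, turning an SOV-like order into an SVO-like one. This is not the authors' rule set, which the abstract does not spell out. The `Token` class, the relation labels ("subj", "obj", "root"), the chunk-level romanized tokens, and the function name `pre_reorder` are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    idx: int      # position in the original sentence (must equal the list index)
    surface: str  # surface form, romanized here for readability
    head: int     # index of the dependency head, -1 for the root
    rel: str      # hypothetical relation label: "subj", "obj", "root"


def pre_reorder(tokens: List[Token]) -> List[str]:
    """Toy rule: move a head that follows its object to just before that object."""
    order = [t.idx for t in tokens]
    for t in tokens:
        # Collect the object dependents of this token, if any.
        objs = [d.idx for d in tokens if d.head == t.idx and d.rel == "obj"]
        if not objs:
            continue
        first_obj = min(objs)
        # Only reorder when the head currently comes after its leftmost object.
        if order.index(t.idx) > order.index(first_obj):
            order.remove(t.idx)
            order.insert(order.index(first_obj), t.idx)
    return [tokens[i].surface for i in order]


# Hypothetical chunk-level parse of "watashi-wa ringo-o taberu" ("I eat an apple").
sentence = [
    Token(0, "watashi-wa", 2, "subj"),
    Token(1, "ringo-o", 2, "obj"),
    Token(2, "taberu", -1, "root"),
]
print(pre_reorder(sentence))
```

A real system would work at the morpheme level and would need many more rules (for particles, auxiliaries, and subordinate clauses); the single rule here only shows the general mechanism of reordering the source side before phrase-based translation.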

Similar resources

Comparison of the Impact of Word Segmentation on Name Tagging for Chinese and Japanese

Word segmentation is usually considered an essential step for many Chinese and Japanese Natural Language Processing tasks, such as name tagging. This paper presents several new observations and analyses on the impact of word segmentation on name tagging: (1) Due to the limitation of current state-of-the-art Chinese word segmentation performance, a character-based name tagger can outperform its...

Exploiting Shared Chinese Characters in Chinese Word Segmentation Optimization for Chinese-Japanese Machine Translation

Unknown words and word segmentation granularity are two main problems in Chinese word segmentation for Chinese-Japanese Machine Translation (MT). In this paper, we propose an approach that exploits the Chinese characters shared between Chinese and Japanese to optimize Chinese word segmentation for MT, aiming to solve these problems. We augment the system dictionary of a Chinese segmenter b...

Bilingualism, Biliteracy and Metalinguistic Awareness: Word Awareness in English and Japanese Users of Chinese as a Second Language

Cross-linguistic research shows that some aspects of metalinguistic awareness are affected by characteristics of different writing systems. Users of writing systems that mark word boundaries (such as English) develop word awareness, while users of unspaced writing systems (such as Chinese) do not. Previous research showed that English-speaking users of Chinese as a Second Language (CSL) have hi...

The effect of canonical word order on the production and comprehension of pseudoclefts in L2

This study investigated the effect of word order and age on the production and comprehension of pseudoclefts in L2 across two experiments. For each experiment 16 female students aged between 179 and 210 months were recruited from a secondary school. These students were divided into two groups based on their age range; one group for investigating the effect of word order and age on the productio...

Chinese and Japanese Word Segmentation Using Word-Level and Character-Level Information

In this paper, we present a hybrid method for Chinese and Japanese word segmentation. Word-level information is useful for analysis of known words, while character-level information is useful for analysis of unknown words, and the method utilizes both these two types of information in order to effectively handle known and unknown words. Experimental results show that this method achieves high o...

Japanese Kanji Word Processing for Chinese Learners of Japanese: A Study of Homophonic and Semantic Primed Lexical Decision Tasks

The current study investigates phonological involvement in Japanese word recognition by advanced and intermediate Chinese learners. A homophonic, semantic and unrelated (control) primed lexical decision task was used to test the participants’ reaction times (RTs) and accuracy scores. Only the RTs of the participants’ accurate YES responses in the lexical decision task (yes/no) were used as dep...


Publication date: 2014